Study of Asian indexes by a newly derived dynamic model

We treat stock prices as a dynamic system and characterize their movements by a newly derived dynamic model, called the new Price Reversion Model (nPRM), for which the solution is derived and carefully analyzed under different circumstances. We also develop a procedure for applying the nPRM to real daily closing prices of a stock index. The proposed procedure brings a thermodynamics-based perspective to the study of stock prices, and the time-varying coefficients in the nPRM give the stock movements economic meaning. More specifically, the average of smoothed historical data, A, in the nPRM, analogous to the environment temperature in Newton's law of cooling, represents an implied equilibrium price. The heat transfer coefficient κ is adapted to be either negative or positive, which describes the speed of convergence or divergence of stock prices, respectively. The empirical study of ten Asian stock indexes shows that the nPRM accurately characterizes and forecasts the market values.


Introduction
The stock index, a key variable reflecting the economic status of a region, is a popular research topic in macroeconomics. The market price of an individual stock reflects the company's existing value or future profit growth. A stock index, measured by the weighted mean price of selected stocks in a stock market, represents the stock market's performance and potential competitive ability. Investors, who have different motives for participating in the market, are concerned with changes in the market prices of a stock or a stock index at different scales: intraday high-frequency data, daily closing prices, and weekly or monthly summaries. Accurately forecasting stock movements could help investors decide on an investment strategy.
Whether stock prices can be forecasted has been repeatedly discussed in finance [1,2]. The efficient market hypothesis [3] forms the foundation of the theory of the circumstances under which stock prices are predictable. The efficient market hypothesis states that three types of [...] is another task, which can be tackled by tree-based models [23] to predict the stock price direction. This paper considers a physical-mathematical method, the approach of differential equations and dynamic systems. From thermodynamics, we adopt Newton's law of cooling to describe the mean reversion of stock index prices. In recent years, modeling the stock index from a physical perspective by treating it as a dynamic system has been emerging [26][27][28][29]. To the best of our knowledge, the first attempt to study a stock index, the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX), as well as its options, through dynamic models is [26], in which the models are derived based on parabola approximation [30]. Another study [24] also considers stock markets as complex dynamical systems, but their interactions are analyzed by signal processing methods, the EMD and the EEMD, along with detrended cross-correlation analysis (DCCA). Existing studies recognize the nonlinear behavior of stock movements [31][32][33][34]. However, the model fits in [33,34] are obtained via statistical methods during a period of uptrend or downtrend.
Based on the related work [27], we consider applying the Price Reversion Model (PRM), which has the form of Newton's law of cooling, to stock markets. Yet the empirical evidence in [27] shows that the PRM tends to underestimate the forecasts. Therefore, we identify the reasons for the inaccurate forecasts of the PRM and propose to model stock prices by a newly derived model, the new Price Reversion Model (nPRM). For the same dynamic model described by the differential equation in the PRM, we derive its solutions rigorously so that the solutions of the nPRM fit realistic market prices better than those of the PRM. The nPRM makes two major contributions. Firstly, the nPRM seeks solutions that not only achieve higher prediction accuracy but also carry useful economic meaning (refer to Figs 1 and 2 below). The coefficients in the nPRM, A and κ, represent the implied equilibrium price and the speed of convergence to the implied equilibrium price, respectively. Secondly, many predictive models in time series analysis require transforming the stock prices into other economic variables, usually returns or price increments. The nPRM makes predictions directly for the stock prices, which implies that the model considers all daily information in the analysis. Furthermore, the proposed procedure is simple and convenient to carry out.

Methods
We treat the stock index, denoted as S(t), as a dynamic system, where S(t) is a differentiable function of time t. We model the subject by the nPRM, represented by the following differential equation:
$$\frac{dS(t)}{dt} = \kappa\,\bigl(S(t) - A\bigr), \tag{1}$$
where S(t) is the value of the subject at time t, and A and κ are known constants for $t \in [t_0, t_1]$. We further denote $v(t) = \frac{dS(t)}{dt}$ and $a(t) = \frac{d^2 S(t)}{dt^2}$, which represent the velocity and acceleration of the dynamic system. For simplicity, the system along with its velocity and acceleration at the specified time points $t_0$ and $t_1$ are denoted as $(S(t_0), v(t_0), a(t_0)) = (S_0, v_0, a_0)$ and $(S(t_1), v(t_1), a(t_1)) = (S_1, v_1, a_1)$, respectively.
Eq (1) is adapted from Newton's law of cooling [35]. In Newton's law of cooling, κ is always negative so that the temperature of the subject converges to the environment (surrounding) temperature. Additionally, when the difference between the subject temperature and the environment temperature is large, the absolute value of κ is large and the speed of convergence is high. When the subject temperature is closer to the environment temperature, the absolute value of κ is small, which results in slower convergence.
In the differential Eq (1), κ and A are constant during the period $t \in [t_0, t_1]$. Over a longer time period, they generally vary with time in a dynamic system. Furthermore, the time-related coefficients often vary strongly in real-world data. For applications, the coefficients are assumed to be constant, with given values, in a relatively small interval between time $t_0$ and $t_1$.
For completeness, we first review the PRM in the following subsection.

The Price Reversion Model (PRM)
In the existing study [27], the PRM assumes that the external force equals zero at A, and the second derivative is ignored in Eq (1). Depending on the values of S(t), $S_0$, and A, four cases are considered, and the solutions are listed as follows.
Case C. If $S(t) < S_0$ and $(S(t) - A)(S_0 - A) > 0$,
Case D. If $S(t) < S_0$ and $(S(t) - A)(S_0 - A) < 0$,
However, the situations represented by Eqs (3) and (5) essentially do not exist, since the external force equals zero at A. This motivates us to re-derive the nPRM in the next subsection.

Derivation of solutions to the new Price Reversion Model (nPRM)
In Eq (1), A is an equilibrium price of a stock index for a specific market, which is analogous to the environment temperature, and κ is a growth coefficient. The force on the stock price draws it toward the equilibrium if κ < 0 and pushes it away from the equilibrium if κ > 0. The nPRM is a more general dynamic system than the one described by Newton's law of cooling. Therefore, in the nPRM κ can be either positive or negative. When κ < 0, the nPRM behaves like the thermodynamic system; when κ > 0, the dynamic system behaves differently, and the price moves away from the equilibrium.
If κ > 0 in (1), A is a repeller. Fig 1 is the phase plot of the nPRM when κ > 0. The first zone demonstrates the situation in which the force pushes the price S(t) away from A if the initial price $S_0$ is smaller than A. In the second zone, the force repels the price S(t) from the equilibrium price A if the initial price is larger than A. On the contrary, if κ < 0 in (1), A is an attractor. Fig 2 is the phase plot of the nPRM when κ < 0. When the initial price is higher than A, the force draws the price to the equilibrium price A, as the figure demonstrates in the third zone. In the fourth zone, if the initial price is lower than the equilibrium A, the price converges to A.
We solve Eq (1) of the nPRM piecewise as follows. We divide S(t) into intervals by the critical point, A, where $\left.\frac{dS(t)}{dt}\right|_{S=A} = 0$. As a result, S(t) is strictly monotonic in each of the intervals. From Eq (1), after the change of variables we have
$$\int_{S_0}^{S(t)} \frac{dS}{S - A} = \int_{t_0}^{t} \kappa \, du.$$
Then, we have
$$S(t) = A + (S_0 - A)\,e^{\kappa (t - t_0)}. \tag{6}$$
Case 1. Given $S_0 < A$ and $\frac{dS(t)}{dt} < 0$ for $t \in [t_0, t_1]$ as in the first zone in Fig 1, we obtain κ > 0. Therefore, the theoretical value S(t) is strictly decreasing in t, and $S(t) < A$ for $t \in [t_0, t_1]$. The solution blows up since $\lim_{t \to \infty} S(t) = -\infty$.
Case 2. Given $S_0 > A$ and $\frac{dS(t)}{dt} > 0$ for $t \in [t_0, t_1]$ as in the second zone in Fig 1, we obtain κ > 0. Therefore, the theoretical value S(t) is strictly increasing in t, and $S(t) > A$ for $t \in [t_0, t_1]$. The solution blows up since $\lim_{t \to \infty} S(t) = +\infty$.
Case 3. Given $S_0 < A$ and $\frac{dS(t)}{dt} > 0$ for $t \in [t_0, t_1]$ as in the third zone in Fig 2, we obtain κ < 0. Therefore, the theoretical value S(t) is strictly increasing in t, and $S(t) < A$ with $\lim_{t \to \infty} S(t) = A$.
Case 4. Given $S_0 > A$ and $\frac{dS(t)}{dt} < 0$ for $t \in [t_0, t_1]$ as in the fourth zone in Fig 2, we obtain κ < 0. Therefore, the theoretical value S(t) is strictly decreasing in t, and $S(t) > A$ with $\lim_{t \to \infty} S(t) = A$.
A comparison of the cases under different conditions between the PRM and the nPRM is given in Table 1. As mentioned before, cases B and D in the PRM do not exist due to the zero external force at A. By examining the phase plots in Figs 1 and 2, case A of the PRM can be further divided into cases 2 and 4 of the nPRM, and case C of the PRM can be further divided into cases 1 and 3 of the nPRM. We derive solutions for cases 3 and 4 of the nPRM, which cannot be obtained identically from the solutions of the PRM. As a result, the theoretical values obtained via the PRM are not as accurate as those from the nPRM.
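The case analysis above can be illustrated with a minimal Python sketch. It assumes that Eq (1) has the Newton's-law-of-cooling form dS/dt = κ(S − A), so that the solution is S(t) = A + (S_0 − A)e^{κ(t − t_0)}. The function names (`nprm_solution`, `nprm_case`, `euler_solution`) and the numerical values are hypothetical and chosen only for illustration; the forward-Euler integration is included purely as an independent cross-check of the closed form.

```python
import math


def nprm_solution(s0, a, kappa, t, t0=0.0):
    """Closed-form solution of dS/dt = kappa * (S - A) with S(t0) = s0."""
    return a + (s0 - a) * math.exp(kappa * (t - t0))


def nprm_case(s0, a, kappa):
    """Map (S0, A, kappa) to Case 1-4 of the text; None on the boundary."""
    if s0 == a or kappa == 0:
        return None  # S stays constant, so none of the four cases applies
    if kappa > 0:  # A repels the price
        return 1 if s0 < a else 2
    return 3 if s0 < a else 4  # kappa < 0: A attracts the price


def euler_solution(s0, a, kappa, t, steps=10_000):
    """Forward-Euler integration of the same ODE from time 0 (cross-check)."""
    dt = t / steps
    s = s0
    for _ in range(steps):
        s += kappa * (s - a) * dt
    return s


# Illustrative values only: A = 100, one starting price on each side of A.
for s0, kappa in [(90.0, 0.1), (110.0, 0.1), (90.0, -0.1), (110.0, -0.1)]:
    exact = nprm_solution(s0, 100.0, kappa, t=5.0)
    approx = euler_solution(s0, 100.0, kappa, t=5.0)
    print(nprm_case(s0, 100.0, kappa), round(exact, 4), round(approx, 4))
```

For κ > 0 the printed values move away from A = 100, and for κ < 0 they move toward it, which matches the repeller and attractor behavior described for the two phase plots.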

Implementation details of the nPRM
The differential Eq (1) characterizes a dynamic system whose movement is continuous and at least twice differentiable. However, real daily stock price data consist of two parts, the trajectory of the dynamic system and the noise. The trajectory of the dynamic system is the functional part while the noise is random. Modeling noisy data, the stock index for instance, directly by a differential equation is not appropriate. The reasonable approach is to filter out the random noise and then model the functional movement that can be characterized by the differential equation. We establish a procedure of four main steps for applying the nPRM to daily stock prices as follows.
Step 1-Smoothing. The moving average is a simple and commonly used filtering method, and hence we apply it to the stock data. We filter out the noise of the data and define the smoothed stock prices as
$$\bar{S}(t) = \frac{1}{M} \sum_{i=0}^{M-1} S(t - i). \tag{7}$$
Afterwards, we apply model (1) to the smoothed data and obtain the theoretical values. In order to calculate κ accurately, an adequate smoothing period M has to be specified. For this purpose, we estimate the smoothing period M on the training set, as explained later. This smoothing step is not part of the PRM procedure [27]. However, directly modeling the market prices of a stock index suffers from severe fluctuations in the coefficient values, which result from the random part of the price movements, the Brownian motion. Characterizing the smoothed stock prices by the differential Eq (1) is more reasonable, so this step is crucial in our proposed nPRM method.
Step 2-Calculating A. To solve (1) we have to know the fixed value of A for a given time period $[t_0, t_1]$ in advance. In reality, however, the equilibrium price A is unknown and must be estimated. A is calculated using smoothed historical data before the period $[t_0, t_1]$. More specifically, we calculate A as
$$A = \frac{1}{N} \sum_{i = t_0 - N + 1}^{t_0} \bar{S}(i), \tag{8}$$
where $\bar{S}$ denotes the smoothed prices; that is, A is the average of the smoothed stock prices during a period no later than $t_0$. The length of the time interval $[t_0 - N + 1, t_0]$ should be carefully chosen. We also determine the time period N based on the training set.
Step 3-Calculating and controlling the κ value. When solving model (1), the value of κ is first calculated from
$$\kappa = \frac{d\bar{S}(t)/dt}{\bar{S}(t) - A}, \tag{9}$$
obtained by substituting the smoothed price $\bar{S}(t)$ for S(t) in (1). The value of κ is constant during a short period $[t_0, t_1]$. However, it changes over time, and thus we further denote it as $\kappa_t$, $t = t_0, \ldots, t_\tau$, for different time intervals. When $\kappa_t > 0$ and is large, the theoretical value at time t explodes, which does not correspond to the changes of stock prices in the real market. To avoid generating unreasonable theoretical values, we restrict $\kappa_t$ to be no larger than an upper bound. A gradient control method [36] has been proposed to keep gradient values in a reasonable range. Since $\kappa_t$ is a one-dimensional gradient in the nPRM, we apply this method to $\kappa_t$ and restrict its value as
$$\kappa_t^{*} = \min\bigl(\kappa_t, \kappa_{t,Q3}\bigr), \tag{10}$$
where $\kappa_{t,Q3}$ is the third quartile of the previous values $\kappa_i$, for $i = t - j, \ldots, t - 1$ and $j \geq 100$, in the series we obtained. The pool of $\kappa_i$ values comprises at least 100 values from historical data so that we can obtain stable values of $\kappa_t^{*}$.
Step 4-Making τ-step forecasts. The theoretical values of the stock prices during $[t_0, t_1]$ are obtained by the solution (6), in which $S_0$ is replaced by the smoothed price $\bar{S}_0$. Assuming the coefficients remain unchanged for a longer period $[t_0, t_\tau]$, we can obtain theoretical values up to time $t_\tau$. In the discrete time representation, we obtain theoretical daily closing prices at $t = t_0, t_1, \ldots, t_\tau$ as the τ-step forecasts.
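The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R implementation: the moving average is taken to be a trailing M-day average, κ is obtained by rearranging dS/dt = κ(S − A) with a one-day backward difference standing in for the derivative, κ is left unclipped until 100 earlier values are available, and the pool stores the unclipped values. All function names and the synthetic price series are hypothetical.

```python
import math
from statistics import quantiles


def moving_average(prices, m):
    """Step 1: trailing M-day moving average; the first m - 1 entries are None."""
    out = [None] * len(prices)
    for t in range(m - 1, len(prices)):
        out[t] = sum(prices[t - m + 1 : t + 1]) / m
    return out


def equilibrium(smoothed, t0, n):
    """Step 2: average of the smoothed prices over [t0 - n + 1, t0]."""
    window = smoothed[t0 - n + 1 : t0 + 1]
    if len(window) != n or any(x is None for x in window):
        raise ValueError("not enough smoothed history before t0")
    return sum(window) / n


def kappa_estimate(smoothed, t0, a):
    """Step 3a (assumed estimator): kappa = v0 / (S0 - A), backward difference for v0."""
    v0 = smoothed[t0] - smoothed[t0 - 1]
    gap = smoothed[t0] - a
    return 0.0 if gap == 0 else v0 / gap


def clip_kappa(kappa, history):
    """Step 3b: cap kappa at the third quartile of at least 100 earlier values."""
    if len(history) < 100:
        return kappa  # assumption: no bound until the pool is large enough
    q3 = quantiles(history, n=4)[2]
    return min(kappa, q3)


def forecast(smoothed, t0, a, kappa, tau):
    """Step 4: 1- to tau-step values of A + (S0 - A) * exp(kappa * k)."""
    s0 = smoothed[t0]
    return [a + (s0 - a) * math.exp(kappa * k) for k in range(1, tau + 1)]


# Synthetic, deterministic prices for illustration only (not market data).
prices = [100 + 0.05 * t + 2 * math.sin(t / 7) for t in range(300)]
m, n, tau = 5, 30, 5
smoothed = moving_average(prices, m)
pool = []  # earlier unclipped kappa values
for t0 in range(m + n, len(prices)):
    a = equilibrium(smoothed, t0, n)
    raw = kappa_estimate(smoothed, t0, a)
    kappa = clip_kappa(raw, pool)
    pool.append(raw)
print(forecast(smoothed, len(prices) - 1, a, kappa, tau))
```

Keeping each step in its own function makes it straightforward to swap in a different smoother or a different κ estimator without touching the forecasting step.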

Model evaluation
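The forecasting error is measured by three deviation-type measurements, the Mean Absolute Percentage Error (MAPE), the Mean Absolute Error (MAE), and the Root Mean Square Error (RMSE), along with three trend-type measurements, the directional symmetry (DS), the correct up-trend (CP), and the correct down-trend (CD). The definitions are as follows.
$$\mathrm{MAPE} = \frac{1}{T} \sum_{i=1}^{T} \left| \frac{S(i) - \hat{S}(i)}{S(i)} \right|, \qquad \mathrm{MAE} = \frac{1}{T} \sum_{i=1}^{T} \bigl| S(i) - \hat{S}(i) \bigr|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{i=1}^{T} \bigl( S(i) - \hat{S}(i) \bigr)^2},$$
$$\mathrm{DS} = \frac{1}{T} \sum_{i=1}^{T} d_i, \qquad d_i = \begin{cases} 1, & \text{if } (\hat{S}(i) - \hat{S}(i-1))(S(i) - S(i-1)) \geq 0; \\ 0, & \text{otherwise;} \end{cases}$$
$$\mathrm{CP} = \frac{1}{T_1} \sum_{i=1}^{T} d_i, \qquad d_i = \begin{cases} 1, & \text{if } (\hat{S}(i) - \hat{S}(i-1)) > 0 \text{ and } (\hat{S}(i) - \hat{S}(i-1))(S(i) - S(i-1)) \geq 0; \\ 0, & \text{otherwise;} \end{cases}$$
$$\mathrm{CD} = \frac{1}{T_2} \sum_{i=1}^{T} d_i, \qquad d_i = \begin{cases} 1, & \text{if } (\hat{S}(i) - \hat{S}(i-1)) < 0 \text{ and } (\hat{S}(i) - \hat{S}(i-1))(S(i) - S(i-1)) \geq 0; \\ 0, & \text{otherwise;} \end{cases}$$
where $\hat{S}(i)$ is the theoretical value of the model at time i, S(i) is the market value of the data at time i, T is the length of the forecasts, $T_1$ is the number of data points belonging to an up trend, and $T_2$ is the number of data points belonging to a down trend.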
A smaller value of the MAPE indicates that the model better describes the movements of the stock prices. The MAE and the RMSE behave similarly to the MAPE. DS measures the frequency of correct directional forecasts. CP and CD measure the frequency of correct directional forecasts when real market values rise and fall, respectively. Larger values of DS, CP, and CD suggest that the model obtains more correct directional forecasts in general, for up trends, and for down trends, respectively.
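As a rough illustration of the six measurements, the following Python sketch computes them for a pair of equally long series. It makes several assumptions that the text does not fix: all measures are reported as fractions rather than percentages, the trend-type sums run over the consecutive differences available in the series, and T_1 and T_2 are counted from the rises and falls of the market values. The function names are hypothetical, and the toy numbers are not taken from the empirical study.

```python
import math


def mape(actual, pred):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)


def mae(actual, pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)


def rmse(actual, pred):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def directional_scores(actual, pred):
    """Return (DS, CP, CD) computed from consecutive differences."""
    hits = up_hits = down_hits = ups = downs = 0
    for i in range(1, len(actual)):
        d_act = actual[i] - actual[i - 1]
        d_pred = pred[i] - pred[i - 1]
        agree = d_act * d_pred >= 0  # forecast and market move the same way
        hits += agree
        ups += d_act > 0  # assumed definition of T1
        downs += d_act < 0  # assumed definition of T2
        up_hits += agree and d_pred > 0
        down_hits += agree and d_pred < 0
    steps = len(actual) - 1
    return _ratio(hits, steps), _ratio(up_hits, ups), _ratio(down_hits, downs)


# Toy series for illustration only.
market = [1.0, 2.0, 3.0, 2.0, 1.0]
model = [1.0, 2.0, 2.5, 3.0, 2.0]
print(mape(market, model), mae(market, model), rmse(market, model))
print(directional_scores(market, model))
```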

Results
For application to real data, we split the data into a training set and a testing set, consisting of the earliest 80% and the latest 20% of the data, respectively. We use the training set to specify an adequate smoothing period and the time interval for estimating the equilibrium, that is, the appropriate choices of M and N in (7) and (8), respectively.

Applied data
The empirical study was conducted based on information from real daily closing price data of ten stock indexes from 2009 to 2019. The indexes include Taiwan TAIEX, Japan Nikkei 225, Korea KOSPI, Hong Kong Hang Seng Index, the Philippines PSEi Index, Thailand SET Index, India S&P BSE SENSEX, Singapore Straits Times Index (STI), Indonesia JKSE, and Malaysia KLCI. The data are collected from Refinitiv Datastream. Each stock index is viewed as a dynamic system and characterized by the nPRM separately.
We implemented the nPRM with several choices of M and N to obtain τ-step forecasts for the training dataset of each stock index. The values of M range from 2 to 10, while the choices of N are 30, 60, 90, 120, and 180. We set the number of forecasting steps to τ = 5. We assess all combinations of M and N by the grand mean of the MAPEs across the 1-step to 5-step forecasts in the training set. Afterwards, we select the combination with the best performance as the coefficient setting for prediction in the testing set. Table 2 shows the best combination of M and N, that is, the one with the smallest grand mean of MAPEs in the training set, for each of the ten indexes. In the following, the forecasts are conducted with the M and N specified for each stock index in Table 2.
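The selection of M and N described above amounts to a grid search on the training portion of a chronological split. The Python sketch below shows one way this could be organized; it is not the authors' R code. The forecaster is passed in as a callable, and `flat_ma_forecast` is a hypothetical stand-in (it simply repeats the last M-day average and ignores N) used only so that the example runs on its own. The warm-up of M + N observations before the first forecast origin is likewise an assumption.

```python
import math
from itertools import product


def chronological_split(series, train_frac=0.8):
    """Earliest train_frac of the data for training, the rest for testing."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]


def grand_mean_mape(train, forecast_fn, m, n, tau=5):
    """Mean over horizons 1..tau of the MAPE across all forecast origins."""
    abs_pct = [[] for _ in range(tau)]
    for t0 in range(m + n, len(train) - tau):  # assumed warm-up of m + n days
        preds = forecast_fn(train[: t0 + 1], m, n, tau)
        for k in range(tau):
            actual = train[t0 + 1 + k]
            abs_pct[k].append(abs((actual - preds[k]) / actual))
    mapes = [sum(errs) / len(errs) for errs in abs_pct]
    return sum(mapes) / tau


def select_m_n(train, forecast_fn, m_grid=range(2, 11),
               n_grid=(30, 60, 90, 120, 180), tau=5):
    """Return the (M, N) pair with the smallest grand mean MAPE, plus all scores."""
    scores = {(m, n): grand_mean_mape(train, forecast_fn, m, n, tau)
              for m, n in product(m_grid, n_grid)}
    return min(scores, key=scores.get), scores


def flat_ma_forecast(history, m, n, tau):
    """Hypothetical stand-in forecaster: repeat the last M-day average."""
    level = sum(history[-m:]) / m
    return [level] * tau


# Synthetic, deterministic prices for illustration only (not market data).
prices = [100 + 0.05 * t + 2 * math.sin(t / 7) for t in range(400)]
train, test = chronological_split(prices)
best, scores = select_m_n(train, flat_ma_forecast)
print(best, scores[best], len(test))
```

Passing the forecaster as an argument keeps the search independent of the model, so the same loop could score the nPRM, the martingale, or a curve fit on identical forecast origins.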
We conduct all the analyses on a computer with an Intel(R) Core(TM) i7-8565U Processor at 1.80GHz with 24.0 GB RAM running Windows 10. All computations are performed using R version 4.0.3 [37] and the R package Metrics.

Empirical results
The main reason for the underestimation by the PRM in the existing study [27] is the difficulty in determining the sign of the growth coefficient, κ, during the integration. In the proposed nPRM procedure, we derive the solution to the differential equation of the nPRM and propose adequate economic explanations of κ. Additionally, when κ > 0 we control its value by a gradient control method [36], and the nPRM is therefore better suited to real markets.
Table 3 displays the MAPEs of applying the investigated methods to all ten Asian indexes from 1-step to 5-step forecasts. Since the three deviation-type errors (MAPE, MAE, and RMSE) perform similarly, we simply report and discuss the results of the MAPEs. From Table 3, the martingale (see details in S1 Appendix) has the smallest errors, followed by the nPRM, and the curve fitting (see details in S2 Appendix) has the worst performance.
Table 4 displays the directional symmetry (DS) of the nPRM and the curve fitting from 1-step to 5-step forecasts for the ten Asian stock indexes. The trend-type errors cannot be applied to the forecasts of the martingale since the martingale forecasts the price of the next day by the present price. Therefore, we only compare the nPRM and the curve fitting here. Table 4 shows that the nPRM and the curve fitting have similar percentages of correct trend forecasts. The differences between the up-trend forecasts and the down-trend forecasts of the nPRM and the curve fitting are also not significant (see details in S1 and S2 Tables).
Apart from the martingale, we can investigate the level of market efficiency by sample entropy [38,39]. Sample entropy measures complexity. Given a time series of length n, $\{S(1), S(2), \ldots, S(n)\}$, a template vector of length m is defined as $S_m(i) = \{S(i), S(i+1), \ldots, S(i+m-1)\}$, and the distance function $d[S_m(i), S_m(j)]$, $i \neq j$, is defined as the Chebyshev distance.
Then, we define the sample entropy of the series, SampEn, as
$$\mathrm{SampEn} = -\ln \frac{B_{m+1}}{B_m},$$
where $B_m$ is the number of template vector pairs of length m having $d[S_m(i), S_m(j)] \leq r$, $B_{m+1}$ is the corresponding number for templates of length m + 1, and r is the tolerance. The value of sample entropy is nonnegative. A sample entropy of 0 indicates that the series is perfectly regular without noise, while a high value reflects randomness and unpredictability. If we examine market efficiency from the perspective of sample entropy, a high value of sample entropy indicates a market with a high level of efficiency. To show how large the sample entropy of a highly random series can be, we simulate a series of length 180 from a normal distribution with zero mean and unit variance 1000 times. The average value of sample entropy is 2.21. Table 5 shows the mean value of sample entropy calculated from all series used in the estimation of A and κ to make forecasts dynamically in the testing set for the ten indexes in our empirical study. The mean values range between 0.5952 (Korea KOSPI) and 0.7833 (the Philippines PSEi), which indicates similar complexity among these markets. We expect the forecasting errors to be larger for markets with a higher level of efficiency since it is harder to predict a series with random noise. Judging by the order of the mean sample entropy values and the MAPEs for the ten stock indexes, the nPRM shows a tendency toward better forecasts for series with lower market efficiency. That is, the forecasting ability of the nPRM usually increases as the mean sample entropy decreases. The Taiwan TAIEX is an exception. The mean sample entropy obtained from the TAIEX series is the second highest, so we would not expect it to have one of the smaller prediction errors among the ten indexes. However, the MAPEs of the nPRM for its 1-step to 5-step forecasts are relatively low. Among the ten markets, the movements of the TAIEX appear to resemble those characterized by the nPRM.
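A direct, quadratic-time computation of sample entropy can be sketched in Python as below. The embedding dimension m = 2 and the tolerance r = 0.2 times the standard deviation are common defaults assumed here for illustration; the text does not state the values used in the study, so this sketch is not expected to reproduce the reported numbers. The function name `sample_entropy` and the example series are hypothetical.

```python
import math
from statistics import pstdev


def sample_entropy(series, m=2, r=None):
    """SampEn = -ln(B_{m+1} / B_m) with the Chebyshev distance between templates."""
    n = len(series)
    if r is None:
        r = 0.2 * pstdev(series)  # assumed default tolerance

    def count_pairs(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i : i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(x - y) for x, y in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b_m = count_pairs(m)
    b_m1 = count_pairs(m + 1)
    if b_m == 0 or b_m1 == 0:
        return math.inf  # no matching templates: entropy is undefined or unbounded
    return -math.log(b_m1 / b_m)


# A perfectly regular series versus a deterministic but irregular one.
regular = [0.0, 1.0] * 20
irregular = [math.sin(i * i) for i in range(60)]
print(sample_entropy(regular, r=0.1), sample_entropy(irregular))
```

Because every matching pair of length m + 1 is also a matching pair of length m over the same starting points, the ratio never exceeds 1 and the result is nonnegative, in line with the property stated above.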
We summarize the performance of the methods as follows. In terms of the deviation-type forecasting errors, the martingale seems to perform the best among the methods. According to the empirical evidence [2], a test based on the martingale can be used to examine the efficiency of a stock market. Although no such test is conducted in this study, the low errors of the martingale for these ten Asian stock indexes imply that the markets tend to be efficient during the period of the data. The martingale suggests that no information from the past can be used to forecast future stock prices. However, under these circumstances our proposed nPRM still captures the trend of the stock movements quite well and also provides meaningful coefficients, A and κ. If the nPRM is applied to markets that are less efficient, the model coefficients may capture more information with economic meaning, and the model may obtain more accurate forecasts than the martingale.
The average computational times of all the investigated methods are shown in Table 6. The martingale uses the least computational time while the curve fitting takes the most. The nPRM is computationally efficient in view of the relationship between forecasting accuracy and computational time.

Discussion
In this study, we propose a novel procedure adapted from the thermodynamic system to model stock prices by a newly derived model, the new Price Reversion Model (nPRM), which is a modification of the PRM. From the viewpoint of signal processing, the proposed nPRM procedure models the stock prices with all available information. Therefore, the coefficients in the nPRM preserve and reveal more economic information about the stock movements. They may also be treated as important features for forecasting by advanced models such as deep learning models. We evaluate the forecasting accuracy of the nPRM along with the martingale and the cubic curve fitting on ten Asian stock indexes. Based on the daily closing prices, the nPRM accurately characterizes and forecasts the large-scale motions in these ten stock markets. Although the forecasting accuracy declines as the forecast horizon grows, which is the case for all the methods, the deviation-type errors of the nPRM are the smallest except for those of the martingale. However, the martingale cannot be employed to forecast the trend of the stock even though it offers an acceptable prediction of the prices if the market is efficient.
The trend-type errors of the nPRM and the curve fitting are close. Furthermore, the trend-type errors of the two methods do not significantly deviate from 0.5 from 1-step to 5-step forecasts. This indicates that the nPRM and the curve fitting do not possess good trend forecasting ability. In order to improve the trend forecasting in the future, we will develop a more sophisticated method to calculate the κ values without the gradient control.